Abstract


In the late 1990s a number of researchers noticed that networks in biology, sociology, and
telecommunications exhibited similar characteristics unlike those of standard random networks. In
particular, they found that the cumulative degree distributions of these graphs followed a power law
rather than a binomial distribution and that their clustering coefficients tended to a nonzero
constant as the number of nodes, n, became large rather than decaying as $O(1/n)$. Moreover, these
networks shared an important property with traditional random graphs: as n becomes large the average
shortest path length scales with log n. This latter property has been coined the small-world
property. When taken together these three properties (small-world, power law, and constant
clustering coefficient) describe what are now most commonly referred to as scale-free networks.
Since 1997 at least six books and over 400 articles have been written about scale-free networks.
This manuscript provides an overview of the salient characteristics of scale-free networks.
Computational experience is provided for two mechanisms that grow (dynamic) scale-free graphs.
Additional computational experience is given for constructing (static) scale-free graphs via a tabu
search optimization approach. Finally, potential applications to general aviation networks are
discussed.
1 Background Information on Random Graphs


The birth of the mathematical study of random graphs is generally attributed to the seminal work of
Erdős and Rényi [29, 30]. Since then a steady stream of books and research articles has studied
these mathematical objects. A graph is a collection of n nodes (vertices) and edges (links, arcs,
bonds) that connect the nodes. In a random graph we are given a probability p that a pair of nodes
has an edge connecting them. Assuming independence, the expected number of edges in a random
graph is $p \cdot n(n-1)/2$. Typically these graphs are denoted $G_p$ or $G_{n,p}$. A variety of
interesting properties have been discovered about random graphs. Three that will be central
to our discussion are the average shortest path length, the degree distribution, and the clustering
or transitivity coefficient.



Fig. 1. A Graph with 9 nodes and 9 edges 

The degree of a node in an undirected graph is the number of incident edges. For example, in
Figure 1 the degree of node 1 is 5. The degrees that occur are {1, 2, 3, 5} and the fractions of
the nodes having each degree k are p(k) = {5/9, 1/9, 2/9, 1/9} respectively. This latter set is
called the degree distribution. For random graphs p(k) is known to be binomially distributed and,
as n approaches infinity, Poisson distributed. Often the cumulative distribution
P(k) = {1, 4/9, 3/9, 1/9}, where P(k) denotes the fraction of nodes with degree k or larger, is of
greater interest.
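
As a small worked check of the degree distribution and its cumulative form, the sketch below recomputes p(k) and P(k) from a list of node degrees. Only the degree of node 1 is stated in the text; the assignment of the remaining degrees to the other eight nodes is an assumption chosen to match the stated fractions, and the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction

def degree_distributions(degrees):
    """Return (p, P): p[k] is the fraction of nodes with degree k,
    P[k] is the fraction of nodes with degree k or larger."""
    n = len(degrees)
    counts = Counter(degrees)
    p = {k: Fraction(counts[k], n) for k in sorted(counts)}
    P = {k: sum(p[j] for j in p if j >= k) for k in p}
    return p, P

# Assumed degrees for the 9-node graph: one node of degree 5, two of degree 3,
# one of degree 2 and five of degree 1 (18 edge endpoints, hence 9 edges).
figure1_degrees = [5, 3, 3, 2, 1, 1, 1, 1, 1]
p, P = degree_distributions(figure1_degrees)
```

Exact fractions are used so that the result can be compared directly with the values quoted in the text.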

The shortest path distance between two nodes in an undirected graph with no edge weights is the
number of edges on a shortest path between them. In general, an $O(n^3)$ algorithm is needed to
compute the shortest path distance matrix. The shortest path distance matrix for the graph in
Figure 1 is
There is one row and one column for each node. For random graphs it is known that the average
shortest path distance scales with the log of the number of nodes. The third metric associated with
random graphs of interest to us is the transitivity or clustering coefficient. The goal is to find
a (0, 1) scale for keeping track of how many complete subgraphs on three nodes (triangles) are
present. One way to do this is to compute

$$C^{(1)} = \frac{3 \times [\text{total number of triangles}]}{[\text{total number of connected three tuples of nodes}]}.$$

The three in the numerator means that triangles count as three distinct three tuples while the
denominator is equivalent to counting the distinct paths of length two. A second popular method
used to capture transitivity is to first compute for each node i

$$C_i = \frac{[\text{number of triangles in which node } i \text{ is incident}]}{[\text{number of three tuples of connected nodes centered on node } i]}.$$

Then we define $C^{(2)} = \frac{1}{n} \sum_i C_i$. For the graph in Figure 1, $C^{(1)} = 1/5$ and
$C^{(2)} = 11/90$. In general, both clustering coefficients tend to (average degree)/$(n-1)$ as n
becomes large for $G_p$ graphs. Hence, as n goes to infinity the clustering coefficient goes to zero.
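
The two clustering coefficients described above can be sketched as follows. The adjacency-set representation, the function name, and the convention that nodes of degree 0 or 1 contribute zero to the per-node average are assumptions made for illustration; the example graph is a triangle with one pendant node, not the graph of Figure 1.

```python
from itertools import combinations

def clustering_coefficients(adj):
    """adj maps each node to the set of its neighbours (undirected, no self-loops).
    Returns (c1, c2): the triple-based coefficient and the mean of the per-node values."""
    closed = 0    # connected triples whose end nodes are adjacent (3 per triangle)
    triples = 0   # all connected triples, counted by their centre node
    local = []
    for v, nbrs in adj.items():
        pairs = len(nbrs) * (len(nbrs) - 1) // 2
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        closed += linked
        triples += pairs
        # Assumed convention: a node with fewer than two neighbours contributes 0.
        local.append(linked / pairs if pairs else 0.0)
    c1 = closed / triples if triples else 0.0
    c2 = sum(local) / len(local)
    return c1, c2

# Triangle 0-1-2 with a pendant node 3 attached to node 0.
example = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
c1, c2 = clustering_coefficients(example)   # c1 = 3/5, c2 = 7/12
```

Summing closed triples over centre nodes counts every triangle three times, which is the factor of three in the numerator of the first definition.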

In optimization modeling random graphs are a frequent mechanism for generating testbeds to
compare the performance of competing algorithms. For example, in [42] two distinct classes of
random graphs are generated both to tune the parameters of the authors' implementation of simulated
annealing for the graph partitioning problem and to compare its performance against other competing
heuristics. The two classes of random graphs provided a set of repeatable experiments (both rely on
a pseudorandom number generator and a seed value) and an easy way to test relatively large problem
instances. Graphs with up to 1500 nodes were examined.


2 Properties of Scale-Free Graphs 

In the late 1990s a number of researchers noticed that real graphs in biology, sociology, and
telecommunications exhibited similar graph characteristics unlike those of standard random graphs.
In particular, they found that the cumulative degree distributions followed a power law rather than
a binomial distribution and that the clustering coefficients tended to a nonzero constant as n
became large rather than decaying as $O(1/n)$. Moreover, these real graphs shared the property with
random graphs that as n becomes large the average shortest path length scales with log n. This
latter property has been coined the small-world property. When taken together these three
properties (small-world, power law, and constant clustering coefficient) describe what are now most
commonly referred to as scale-free graphs.

As a result of these observations an explosion of activity commenced. Since 1998 several books
have been dedicated to the topic including the popular books Linked by Barabási [8], Six Degrees
by Watts [64] and Nexus by Buchanan [17]. In addition, a number of excellent review articles exist
including Newman [53], Strogatz [60], Albert and Barabási [3] and Dorogovtsev and Mendes [24].
These review articles contain hundreds of references. For example, [53] contains 429 references. We
note, as have many other authors, that graphs whose degree distributions follow a power law are not
new. For example, Milgram's [51] work with acquaintance networks in the United States led to his
conclusion that there were six degrees of separation among the individuals studied. However, the
early references to graphs following power laws are small in number when compared to the current
explosion in interest.

Table 1 provides a summary of essential characteristics of four real graphs. The interested reader
is referred to Newman [53] for a more exhaustive table. Clearly the average shortest path lengths
for the graphs in Table 1 grow slowly. For example, the value of 16.18 for the Altavista world wide
web graph means that Altavista web pages are on average 16 clicks away even though there are
potentially 2.1 billion edges to traverse. If the average shortest path length scales
logarithmically or slower with the number of nodes in the graph then it is called a small-world
graph (see Watts [64]). The natural logarithms of the number of nodes for the graphs listed in
Table 1 are 13.02, 19.13, 9.28, and 7.66 respectively, and in each case the average shortest path
distance is smaller than this value. Consequently, all graphs listed in Table 1 exhibit the
small-world property. However, this property alone does not capture scale-free graphs as
traditional random graphs are also small-world.
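
The comparison between the average shortest path length and the logarithm of the number of nodes can be sketched with a breadth-first search from every node. The function name and adjacency-set representation are illustrative assumptions, and a single graph can only be checked against ln n; demonstrating the scaling itself would require a family of graphs of increasing size.

```python
import math
from collections import deque

def average_shortest_path_length(adj):
    """Mean hop count over all ordered pairs of distinct, mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")

# A 4-cycle: four adjacent pairs at distance 1 and two opposite pairs at distance 2.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
avg = average_shortest_path_length(cycle4)          # 8/6 = 4/3
small_world_check = avg <= math.log(len(cycle4))    # compare with ln n
```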


Graph        Nodes          Edges            AvgDeg    AvgSPD   $C^{(1)}$   $C^{(2)}$   Ref
film actor   449,913        25,516,482       113.443   3.48     0.20        0.78        11, 12
Altavista    203,549,046    2,130,000,000    10.46     16.18    -           -           13
Internet     10,697         31,992           5.98      3.31     0.035       0.39        14, 15
protein      2,115          2,240            2.12      6.80     0.072       0.071       16


Table 1. Basic statistics for a number of published graphs 


We have already noted that the tail of the cumulative degree distribution for scale-free graphs
follows a power law. This is easiest to see if P(k) versus k is plotted on a log-log scale. If the
tail follows a power law, the plot of the tail will be a line on the log-log scale whose slope is
the exponent of the power function. If instead the tail follows an exponential distribution then a
log-linear plot will be nearly linear. Figure 2 is a log-linear plot of a cumulative degree
distribution with an exponential tail. The graph that gave rise to Figure 2 was generated in the
following way. First, 1000 random (x, y) coordinates were generated in a 100 by 100 unit square.
Next, the first three of these points were chosen as the initial set of nodes. Two edges were added
by connecting node 1 to node 2 and node 2 to node 3. The average degree of this initial graph is
4/3. Additional nodes were added one at a time. Edges were placed between the new node and existing
nodes by selecting the nodes closest, with respect to Euclidean distance, to the new node. The
number of new edges to be added is computed by rounding the average degree of the current graph to
the nearest integer. Real networks whose degree distributions display exponential tail behavior
include the power grid of the Western United States (see [53] page 187). Figure 3 is a log-log plot
of a cumulative degree distribution of a graph following a power law. As in the previous case the
graph is generated by starting with the same three-node, two-edge graph. Additional nodes are added
one at a time. As before the number of edges to be added is computed as the nearest integer to the
average degree of the current graph. The difference is that node connections are based upon
roulette wheel selection. The roulette wheel is divided into sectors following the degree
distribution of the current graph (i.e., preferential attachment). Clearly Figure 3 is nearly linear
and hence is associated with a scale-free graph. The codes used to generate both of these random
graphs are provided in the Appendix.
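
The two growth procedures described above can be sketched as follows. This is a literal reading of the prose, not the Appendix code: the function names, the round-half-up rule, the cap on the number of new edges, and roulette selection without replacement are all assumptions.

```python
import math
import random

def grow_graph(n_nodes, rule, seed=0):
    """Grow a graph from the path 0-1-2. rule is 'nearest' (closest existing
    nodes in a 100 x 100 square) or 'preferential' (degree-weighted roulette wheel)."""
    if n_nodes < 3 or rule not in ("nearest", "preferential"):
        raise ValueError("need n_nodes >= 3 and a known rule")
    rng = random.Random(seed)
    pts = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n_nodes)]
    adj = {0: {1}, 1: {0, 2}, 2: {1}}
    for new in range(3, n_nodes):
        existing = list(adj)
        avg_degree = sum(len(nb) for nb in adj.values()) / len(adj)
        # Assumed rounding: nearest integer with halves rounded up, capped at the
        # number of existing nodes.
        m = min(len(existing), max(1, int(avg_degree + 0.5)))
        if rule == "nearest":
            by_distance = sorted(existing, key=lambda v: math.dist(pts[new], pts[v]))
            targets = by_distance[:m]
        else:
            targets, pool = [], existing[:]
            for _ in range(m):
                weights = [len(adj[v]) for v in pool]   # sector size = current degree
                pick = rng.choices(pool, weights=weights, k=1)[0]
                targets.append(pick)
                pool.remove(pick)                       # assumed: no repeated targets
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj

def cumulative_degree_distribution(adj):
    """Map each occurring degree k to the fraction of nodes with degree k or larger."""
    degrees = [len(nb) for nb in adj.values()]
    return {k: sum(d >= k for d in degrees) / len(degrees) for k in sorted(set(degrees))}
```

Plotting the returned pairs on log-linear axes for the nearest-neighbour rule and on log-log axes for the preferential rule gives the kind of display discussed in the text; the plotting itself is left out to keep the sketch free of dependencies.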

Table 1 also records the average degree of each graph in column 4. This number does not
provide degree distributional information but it is still a useful metric to record. The average
degree is used to estimate the clustering coefficient when the number of nodes is large. Moreover,
in the next section we will use the average degree as one of the objectives in a bi-objective
approach to generate scale-free graphs. The optimization approach is in contrast to the preferential
attachment description of scale-free graphs promoted by Barabási [8] and many others. Lastly,
Table 1 records the two clustering coefficients in columns 6 and 7. It is impossible to verify from
these single snapshots whether or not these values remain constant as the number of nodes
increases. An effective display of this property would be a plot of the clustering coefficient as
the number of nodes increases over time. Plots of this sort appear most often for scale-free graphs
generated by some artificial mechanism that can be controlled. However, we can verify that the
clustering coefficients in Table 1 are much larger than the expected value for a $G_p$ graph. For a
$G_p$ graph the expected value of the clustering coefficient is the average degree divided by
$n-1$. So, for example, if the film actor graph were similar to a $G_p$ graph then the expected
clustering coefficient would be 113.443/449,912 or 0.000252, which is about three orders of
magnitude smaller than either $C^{(1)} = 0.20$ or $C^{(2)} = 0.78$.
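
The same comparison can be repeated for the other rows of Table 1 that report clustering coefficients. The sketch below only re-uses the table's numbers; the dictionary layout and function name are illustrative, and Altavista is omitted because its clustering coefficients are not reported.

```python
# (nodes, average degree, first coefficient, second coefficient) copied from Table 1.
table1 = {
    "film actor": (449_913, 113.443, 0.20, 0.78),
    "Internet": (10_697, 5.98, 0.035, 0.39),
    "protein": (2_115, 2.12, 0.072, 0.071),
}

def expected_gp_clustering(n, avg_degree):
    """Expected clustering coefficient of a G_p graph: average degree / (n - 1)."""
    return avg_degree / (n - 1)

# Ratio of each observed coefficient to the G_p expectation.
ratios = {
    name: (c1 / expected_gp_clustering(n, k), c2 / expected_gp_clustering(n, k))
    for name, (n, k, c1, c2) in table1.items()
}
```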



Fig. 2. 2000 nodes, degree = current average degree, edges closest Euclidean nodes 



Fig. 3. 1000 nodes, edges added preferentially based on degree distribution 


3 Optimization and Scale-Free Graphs 

One of the captivating features of scale-free graphs is that they arise naturally without the aid
of a human designer. Two intriguing questions are why and how. One partial answer to the why
question is error tolerance and communication. Communication between the nodes in a scale-free
graph is carried out efficiently. One of the defining properties of scale-free graphs is that the
length of the average shortest path is small. For example, in Table 1 we noted that in the
Altavista graph with 2.1 billion edges the average shortest path length is 16.18. It is not clear,
however, exactly how the entities traversing these graphs know how to find these paths. Kleinberg
[44], in a well-referenced article, appears to be one of the first to study the path decision
problem in scale-free graphs. He was able to characterize a class of scale-free graphs, similar to
those studied by Watts and Strogatz [63], for which a decentralized algorithm is able to find
shortest paths with high probability. Although Milgram [51] did not directly address this question,
his conclusions would not have been possible unless the human subjects involved had been able to
solve their own shortest path problems. There is also a downside to the existence of such effective
communication. The objects represented by scale-free graphs are susceptible to virus transmission.
If all of the 200 million nodes in the Altavista graph are about 16 clicks away from one another
then it is easy to see why computer viruses are transmitted so quickly.

Error tolerance is perhaps a less obvious criterion for the why question but is critical to the 
survival of real graphs. A variety of authors have noted that scale-free graphs are highly resistant 
to random edge and/or node failures. See, for example, [2], [21] and [13]. Scale-free graphs are, 
however, powerless against targeted node failures. That is, if an intelligent entity takes down the
nodes of highest degree, scale-free networks disconnect quickly. In response to this latter weakness
several researchers have studied ways to strengthen scale-free graphs (see [18] and [57]). 

The how question for scale-free graphs is still far from resolved. Initially the focus was on
preferential attachment models, which is how Figure 3 was generated. However, this left many
researchers unconvinced and has led to bi-objective optimization models as well as dynamical
systems models [39] and statistical mechanics models [3]. A few authors have used some of the
metrics listed in Table 1 as their objectives in a bi-objective approach to generate scale-free
graphs. For example, Ferrer-Cancho and Solé [33] minimize a linear combination of average shortest
path distance and the total number of edges in the graph. They show that, by varying the objective
function weights, the resulting graphs vary from trees to scale-free graphs to a star graph (one
node of degree $n-1$ and all other nodes of degree 1). A similar result is obtained by Mathias and
Gopal [50]. Solé et al. [59] provide a discussion of evolving biological systems and the role of
optimization in that context. In addition, they point out that preferential attachment does not
provide an adequate explanation for the large clustering coefficients observed in real networks.
They believe that clustering is a side effect of optimization. They conjecture that reliable
communication and cost minimizing shapes are the organizing principles behind scale-free graphs.
Following [33], Figure 4 is the log-log plot of the cumulative distribution of a scale-free graph
found by minimizing
$X \cdot (\text{total number of edges}) + (1 - X) \cdot (\text{average number of path edges})$
with X = 0.50 via a tabu search heuristic. Alternatively, if we replace the total number of edges
objective with average degree we can achieve the same result. Figures 5, 6 and 7 provide an example
of a scale-free graph that arises by minimizing
$X \cdot (\text{average degree}) + (1 - X) \cdot (\text{average number of path edges})$
with X = 0.65. As before the minimization was done via a tabu search strategy. Figure 5 illustrates
the actual graph while Figure 6 gives the cumulative degree distribution. Figure 6 is roughly
linear, indicating that the graph in Figure 5 is scale-free. Figure 7 is a histogram of the degree
distribution of the graph in Figure 5.

Tabu search is a metaheuristic strategy that can be used to control a variety of local search
heuristics. Tabu search seeks to exploit historical information gathered during the local search
phase so that the search will not remain stalled at a local optimum. There are a variety of
mechanisms developed to avoid local optima including tabu lists, recency and frequency based
diversification schemes, and the detection of basins of attraction. Although the roots of tabu
search are present in a variety of early work in artificial intelligence and operations research,
the seminal paper is Glover [34]. Glover [35] and Glover and Laguna [36] provide a comprehensive
list of techniques and applications associated with tabu search specifically and adaptive memory
programming generally.

A plain vanilla tabu search algorithm was developed to minimize a bi-objective based on any
two of the following five objectives: average degree, total number of edges, average shortest path
distance, average shortest number of edges (hop length), and average shortest Euclidean distance.
The user must provide the adjacency matrix of the initial graph, the objectives to be considered,
the value of X for the convex combination of the two objectives, how long a move is to remain tabu,
and the maximum number of iterations allowed. The neighborhood of a solution (available moves) is
all $O(n^2)$ bit flips in the adjacency matrix. Since we are considering undirected graphs the
adjacency matrix is symmetric. If the (i, j) entry is a 1 then there is an edge between nodes i and
j and so the (j, i) entry must also be 1. The most improving move available at each iteration is
selected. A move is made unavailable (tabu) for a pre-determined number of iterations. In
subsequent iterations the inverse move is disallowed so as to avoid (hopefully) a return to a
previously observed solution. We maintained two tabu lists, one for the (i, j) bit flip moves and
one for each row of the adjacency matrix. The idea behind the row tabu list is to avoid focusing
the search on a particular node. For graphs with more than 50 nodes the computation of the average
shortest hop length (or average shortest path length) is expensive, requiring $O(n^3)$ work. For
the complete $O(n^2)$ neighborhood this would result in $O(n^5)$ work per iteration. Consequently,
we chose to examine a randomly chosen subset of the $O(n^2)$ moves. Typically we examined about 20
percent of the neighborhood until progress slowed and we were in the basin of attraction of a
locally (hopefully globally) optimal solution. At this point we switched to a complete neighborhood
search for the last n iterations. Tabu search is by definition a heuristic so there is no guarantee
that the plots in Figures 4 and 5 are for the globally optimal solutions.
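
A stripped-down version of the search described above can be sketched as follows. It is a minimal illustration under stated assumptions, not the implementation used for the figures: it fixes the objective pair to average degree and average hop length, keeps only the bit-flip tabu list (no row tabu list and no final complete-neighborhood phase), assumes a connected starting graph, and gives disconnected graphs an infinite objective value. All function and parameter names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations

def avg_hops(a):
    """Average shortest hop count over node pairs; infinite if the graph is disconnected."""
    n, total = len(a), 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if a[u][v] and dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if min(dist) < 0:
            return float("inf")
        total += sum(dist)
    return total / (n * (n - 1))

def objective(a, weight):
    """Convex combination: weight * average degree + (1 - weight) * average hop length."""
    return weight * sum(map(sum, a)) / len(a) + (1 - weight) * avg_hops(a)

def flip(a, i, j):
    """Toggle edge (i, j), keeping the adjacency matrix symmetric."""
    a[i][j] ^= 1
    a[j][i] ^= 1

def tabu_search(start, weight, tenure, max_iters, sample_frac=0.2, seed=0):
    rng = random.Random(seed)
    a = [row[:] for row in start]                 # work on a copy of the input
    moves = list(combinations(range(len(a)), 2))
    sample_size = max(1, int(sample_frac * len(moves)))
    tabu_until = {}                               # move -> first iteration it is allowed again
    best, best_val = [row[:] for row in a], objective(a, weight)
    for it in range(max_iters):
        chosen, chosen_val = None, float("inf")
        for i, j in rng.sample(moves, sample_size):   # random subset of the neighborhood
            if tabu_until.get((i, j), 0) > it:
                continue
            flip(a, i, j)
            val = objective(a, weight)
            flip(a, i, j)                             # undo the trial move
            if val < chosen_val:
                chosen, chosen_val = (i, j), val
        if chosen is None:                            # every sampled move tabu or disconnecting
            continue
        flip(a, *chosen)                              # best sampled move, even if it worsens
        tabu_until[chosen] = it + 1 + tenure
        if chosen_val < best_val:
            best, best_val = [row[:] for row in a], chosen_val
    return best, best_val
```

Recomputing the hop lengths from scratch for every trial move is what makes the full neighborhood expensive, which is why the sketch, like the text, samples only a fraction of the moves.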

We experimented with a number of objective function combinations. But only two combinations
(the average degree and average hop length, and the total number of edges and average hop
length) resulted in scale-free graphs. We had hoped that the average shortest Euclidean distance
would have resulted in a scale-free graph since it is much more efficient to compute and update.
In addition, we found that the scale-free property was sensitive to our choice of X. Consequently,
we are not convinced that such an optimization procedure accurately depicts what real scale-free
networks are doing when they organize themselves. There is also no reason to suspect that only
two objectives are involved and that these objectives must be in linear combination. An analogous
situation arises in multiobjective decision theory. One approach is to ask decision makers a series
of questions to enable the modeler to uncover the utility function of the decision maker. The
utility function provides a mathematical mapping of the competing objectives into a single
objective. Utility functions are difficult to determine and assumptions of linearity are often
difficult to justify. The interested reader is referred to Keeney and Raiffa [58] for more
information about utility theory and decision making.

Fig. 5. Tabu Search result: 50 nodes, X = 0.65





Fig. 6. Cumulative Degree Distribution: 50 nodes, X = 0.65

Fig. 7. Degree Distribution: 50 nodes, X = 0.65


4 Graphs for General Aviation 

The remaining question is what, if anything, can scale-free graphs be used to model in general
aviation? To begin to answer this question we refer to two network design analyses. The first, by
Lederer and Nambimadom [47], compares the performance of four different network types: hub and
spoke, direct, complete tour, and a collection of subtours. Several simplifying assumptions are
made. One assumption is that all n cities lie equidistant from each other on a circle of fixed
radius. In addition the demand for service is assumed to be identical for all origin-destination
node pairs. The hub lies at the center of the circle and is not allowed to be a destination. It
should be clear how the hub and spoke network as well as the direct network routes are implemented.
The complete tour routing does not utilize the hub at all. Instead it is a traversal of the n
cities in one cycle. A collection of subtours partitions the n cities into k subcollections. Each
subcollection is serviced by beginning at the hub and forming a cycle that includes all the cities
in the subcollection. The objective is to maximize profit, which is achieved by minimizing the sum
of the airline's and the passengers' costs subject to a fixed demand for service.

Lederer and Nambimadom [47] found that the optimal network choice was sensitive to three
problem parameters: demand, distance between cities, and number of cities. If either demand or
distance is small then a direct network is best; if either distance or demand is large then a
complete tour is best. Intermediate values led to the other two networks. If the number of cities
is small a direct network is best. If the number of cities is large a hub and spoke network wins.
Intermediate values for the number of cities led to the other networks as best. In addition, they
noted that the time built into a schedule to protect against delays is smaller for direct networks
and that the direct networks were also the most reliable (e.g. Southwest Airlines). The authors
also found that if fixed and variable costs have constant returns to scale then a direct network is
optimal under the modeling assumptions. The scale-free community has also considered ring lattices.
Watts and Strogatz [63] began the work in this direction. Figure 8 displays such a graph. The
graphs are characterized by the degree of each node (all are identical initially) and a probability
that an edge will be rewired to create a shortcut across the circle. Since these graphs have been
well studied in the scale-free literature they provide a natural avenue for further study following
the approach of Lederer and Nambimadom.
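
A ring lattice with shortcuts of the kind just described can be sketched as below. The function name and the details of the rewiring step (one pass over the lattice edges, with the far endpoint replaced by a uniformly chosen non-neighbour) are assumptions for illustration and may differ from the construction used for Figure 8.

```python
import random

def ring_lattice_with_shortcuts(n, k, p, seed=0):
    """n nodes on a ring, each joined to its k nearest neighbours (k even, k < n);
    each lattice edge is rewired with probability p to create a shortcut."""
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even with 0 < k < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                # Candidate endpoints exclude v itself and its current neighbours,
                # so no self-loops or duplicate edges are created.
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj
```

With p = 0 the result is the regular lattice in which every node has degree k; as p increases, more edges become shortcuts while the total number of edges stays at nk/2.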



Fig. 8. Ring lattice with shortcuts. 

The second analysis is by Yang and Kornfeld [65]. They compare the performance of a hub
and spoke network to other competing networks for overnight package delivery. The authors had
planned to test their model on 26 major cities but found that the resulting mixed-integer program
was too large to solve (about 2,000 integer variables and 17,000 non-integer variables).
Consequently, they focused on two subsets of seven cities: the first seven cities in the
alphabetical list and the largest seven cities in the list. The optimal solution for the first
seven cities is one hub in Atlanta and three partial hubs. For the seven largest cities (with
respect to cargo) the solution contains one hub in New York City and four partial hubs. Next the
authors examined 5 cities placed on a circle. By varying the demand pattern a variety of network
configurations proved optimal. We note that it should be possible to solve (or nearly solve) much
larger networks for the Yang and Kornfeld model. Column generation techniques for large 0-1 integer
linear programs have proven quite effective in generating high quality solutions when over a
million decision variables are possible. For example, Barnhart and Schneur [10] solve a network
design problem (hub and spoke) for an express mail carrier in which the linear programming
relaxation had 800,000 columns of which 20,000 were chosen and passed on to the integer programming
solver. The resulting gap between the solution generated by this approach and the true optimal
solution was shown to be less than 0.5 percent.

In contrast to these computational studies there are standard combinatorial problems whose study
may also shed some light on the issues in designing an airline routing network for general
aviation. In particular, Korte and Vygen [46] devote a chapter to network design problems in their
recent research monograph. In this chapter the survivable network design problem is of interest. It
is posed as follows. We are given an undirected graph G with edge weights and connectivity
requirements r(i, j) for all pairs of nodes i and j. The goal is to find a minimum weight spanning
subgraph of G such that for every pair of nodes i and j there exist r(i, j) edge-disjoint paths
from i to j in the subgraph. The r(i, j) values provide a specified measure of redundant
connections between the nodes in the graph, thus protecting the graph from edge failures. As
expected the survivable network design problem is NP-hard (the Steiner tree problem is a special
case).
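
One common way to write the survivable network design problem is as a cut-covering integer program. The formulation below is a sketch of that standard form rather than a quotation from [46]; the symbols $V$ and $E$ (node and edge sets of G), $w_e$ (edge weight), $x_e$ (1 if edge e is kept) and $\delta(S)$ (edges with exactly one endpoint in S) are notation introduced here.

```latex
\begin{aligned}
\min \quad & \sum_{e \in E} w_e \, x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(S)} x_e \;\ge\; \max_{i \in S,\; j \notin S} r(i, j)
      && \text{for all } \emptyset \neq S \subsetneq V, \\
& x_e \in \{0, 1\} && \text{for all } e \in E.
\end{aligned}
```

By Menger's theorem, r(i, j) edge-disjoint paths between i and j exist exactly when every cut separating i from j contains at least r(i, j) chosen edges, which is what the cut constraints enforce.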

The general aviation network design problem will share some similarities with the overnight
package delivery model of Yang and Kornfeld [65]. That is, we expect there will not be regular
planned routes and connections between origin and destination pairs. Instead, each day (or several
times a day) demands for service will be collected and a route network will be constructed.
Ideally the routing would be from door to door, not simply from airport to airport. There are many
more questions at this point than answers. If we are constructing the network via an optimization
approach, what are the objectives and constraints? The previously mentioned studies both fail to
consider large enough sets of nodes to know whether scale-free graphs are a viable alternative (a
maximum of 24 for Lederer and Nambimadom [47] and only 7 for Yang and Kornfeld [65]).

We close with Table 2, a broad category summary of our literature review of scale-free networks 
and general aviation. 


Category                   References
surveys/books              [53, 24, 3, 60, 17, 64, 8]
optimization               [59, 31, 19, 49, 50, 33, 61]
construction               [7, 1, 25, 43, 62, 48, 45, 26, 14, 54]
dynamical systems          [28, 39]
robustness/fragility       [2, 21, 13, 22, 57, 18, 56]
algorithmic/mathematics    [44, 37, 23, 11, 12, 40, 52]
empirical results          [4, 6, 9, 55]
airline network design     [65, 47, 10, 38, 27, 16, 5]


Table 2. Reference categories for scale-free networks and general aviation. 